Optimal coding and the origins of Zipfian laws
The problem of compression in standard information theory consists of
assigning codes as short as possible to numbers. Here we consider the problem
of optimal coding -- under an arbitrary coding scheme -- and show that it
predicts Zipf's law of abbreviation, namely a tendency in natural languages for
more frequent words to be shorter. We apply this result to investigate optimal
coding also under so-called non-singular coding, a scheme where unique
segmentation is not warranted but codes stand for a distinct number. Optimal
non-singular coding predicts that the length of a word should grow
approximately as the logarithm of its frequency rank, which is again consistent
with Zipf's law of abbreviation. Optimal non-singular coding in combination
with the maximum entropy principle also predicts Zipf's rank-frequency
distribution. Furthermore, our findings on optimal non-singular coding
challenge common beliefs about random typing. It turns out that random typing
is in fact an optimal coding process, in stark contrast with the common
assumption that it is detached from cost cutting considerations. Finally, we
discuss the implications of optimal coding for the construction of a compact
theory of Zipfian laws and other linguistic laws.
Comment: in press in the Journal of Quantitative Linguistics; definition of
concordant pair corrected, proofs polished, references updated
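As a minimal sketch of the logarithmic growth described in the abstract above, the following Python function computes the code length a word of a given frequency rank would receive if the shortest available strings were handed out to the lowest ranks first (non-singular coding: every string is used at most once, with no requirement of unique segmentation). The function name `nonsingular_length` and the `alphabet_size` parameter are illustrative assumptions, not names taken from the paper.

```python
def nonsingular_length(rank, alphabet_size):
    """Length of the string assigned to `rank` (1 = most frequent) when
    strings are handed out shortest-first over `alphabet_size` symbols.

    Hypothetical helper for illustration only.
    """
    if rank < 1 or alphabet_size < 1:
        raise ValueError("rank and alphabet_size must be positive integers")
    length = 1
    # capacity = number of distinct strings of length <= `length`
    capacity = alphabet_size
    while capacity < rank:
        length += 1
        capacity += alphabet_size ** length
    return length


# With a binary alphabet: 2 strings of length 1, 4 of length 2, 8 of length 3.
lengths = [nonsingular_length(r, 2) for r in range(1, 8)]
print(lengths)
```

Counting the strings of length at most l for an alphabet of N >= 2 symbols gives (N^(l+1) - N)/(N - 1), so the loop returns the smallest l for which this count reaches the rank. That is the sense in which length grows roughly as the logarithm of the rank.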
Morphological complexity of languages reflects the settlement history of the Americas
Morphological complexity is widely believed to increase with sociolinguistic isolation, and to decrease with language spreads and absorption of L2 adult learner populations. However, this can be assessed only for communities with well-described histories. Morphological complexity has also been shown to be greater in higher-altitude languages, which are often sociolinguistically isolated, so we use altitude as an empirically determinable proxy for sociolinguistics. In past research, only a very few small locations have been surveyed and the measures of complexity used were family-specific and not easily generalizable. We apply several improved measures of complexity and show that the correlation holds, especially in the Andean regions of South America. We discuss the implications of the South American pattern for the settlement of the Americas and post-settlement prehistoric population formation.
Peer reviewed
Zipf's law of abbreviation as a language universal
Words that are used more frequently tend to be shorter. This statement is known as Zipf's law of abbreviation. Here we perform the widest investigation of the presence of the law to date. In a sample of 1262 texts and 986 different languages (about 13% of the world's language diversity), a negative correlation between word frequency and word length is found in all cases. In line with Zipf's original proposal, we argue that this universal trend is likely to derive from fundamental principles of information processing and transfer.
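To make the notion of a negative frequency-length correlation concrete, here is a small Python sketch that counts word frequencies in a text and computes a rank correlation between frequency and length in characters. The abstract above does not say which correlation statistic was used, so the choice of Kendall's tau-a here is an assumption, as are the function names and the toy sentence.

```python
from collections import Counter
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with >= 2 items")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # product == 0 is a tie and counts for neither side
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


def frequency_length_tau(text):
    """Correlation between word frequency and word length in one text."""
    counts = Counter(text.lower().split())
    words = list(counts)
    freqs = [counts[w] for w in words]
    lengths = [len(w) for w in words]
    return kendall_tau_a(freqs, lengths)


# Hypothetical toy input, not data from the study.
TOY_TEXT = "the cat saw the extraordinary dog and the dog saw the cat"
print(frequency_length_tau(TOY_TEXT))
```

A negative value means that more frequent word types tend to be shorter in that text. Tau-a ignores ties in the numerator only, so on texts with many equal lengths the value stays close to zero even when the trend is consistent; a tie-corrected statistic may be preferable in practice.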
The optimality of word lengths. Theoretical foundations and an empirical study
Zipf's law of abbreviation, namely the tendency of more frequent words to be
shorter, has been viewed as a manifestation of compression, i.e. the
minimization of the length of forms -- a universal principle of natural
communication. Although the claim that languages are optimized has become
trendy, attempts to measure the degree of optimization of languages have been
rather scarce. Here we present two optimality scores that are dually normalized,
namely, they are normalized with respect to both the minimum and the random
baseline. We analyze the theoretical and statistical pros and cons of these and
other scores. Harnessing the best score, we quantify for the first time the
degree of optimality of word lengths in languages. This indicates that
languages are optimized to 62 or 67 percent on average (depending on the
source) when word lengths are measured in characters, and to 65 percent on
average when word lengths are measured in time. In general, spoken word
durations are more optimized than written word lengths in characters. Our work
paves the way to measure the degree of optimality of the vocalizations or
gestures of other species, and to compare them against written, spoken, or
signed human languages.
Comment: On the one hand, the article has been reduced: analyses of the law of
abbreviation and some of the methods have been moved to another article;
appendix B has been reduced. On the other hand, various parts have been
rewritten for clarity; new figures have been added to ease the understanding
of the scores; new citations added. Many typos have been corrected.
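The idea of a score normalized with respect to both the minimum and the random baseline can be sketched as follows. The abstract above does not give the formula, so the form used here, (L_random - L) / (L_random - L_min), is an assumption chosen so that the score is 1 at the minimum and 0 at the random baseline. The function names are hypothetical.

```python
def mean_length(probs, lengths):
    """Mean word length weighted by word probability."""
    return sum(p * l for p, l in zip(probs, lengths))


def dual_normalized_score(freqs, lengths):
    """Assumed score: 1 at the minimum mean length, 0 at the random baseline.

    freqs[i] and lengths[i] describe the same word type.
    """
    total = sum(freqs)
    probs = [f / total for f in freqs]
    observed = mean_length(probs, lengths)
    # Minimum baseline: shortest lengths go to the most probable words.
    minimum = mean_length(sorted(probs, reverse=True), sorted(lengths))
    # Random baseline: expected mean length when lengths are shuffled
    # uniformly at random across words, which is the plain average length.
    random_baseline = sum(lengths) / len(lengths)
    if random_baseline == minimum:
        raise ValueError("score undefined: baselines coincide")
    return (random_baseline - observed) / (random_baseline - minimum)


# Hypothetical toy vocabulary of three word types.
print(dual_normalized_score([4, 2, 1], [1, 2, 3]))  # frequent words shortest
print(dual_normalized_score([4, 2, 1], [3, 2, 1]))  # frequent words longest
```

Under this assumed definition, a value such as 0.62 would mean the observed mean length sits 62 percent of the way from the random baseline to the minimum, and negative values indicate lengths worse than random.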
OCT-4 expression in follicular and luteal phase endometrium: a pilot study
Background: The stem cell marker Octamer-4 (OCT-4) is expressed in human endometrium. Menstrual cycle-dependency of OCT-4 expression has not been investigated to date. Methods: In a prospective, single center cohort study of 98 women undergoing hysteroscopy during the follicular (n = 49) and the luteal (n = 40) phases of the menstrual cycle, we obtained endometrial samples. Specimens were investigated for OCT-4 expression on the mRNA and protein levels using reverse transcriptase polymerase chain reaction (RT-PCR) and immunohistochemistry. Expression of OCT-4 was correlated to menstrual cycle phase. Results: Of 89 women sampled, 49 were in the follicular phase and 40 were in the luteal phase. OCT-4 mRNA was detected in all samples. Increased OCT-4 mRNA levels in the follicular and luteal phases were found in 35/49 (71%) and 27/40 (68%) of women, respectively (p = 0.9). Increased expression of OCT-4 protein was identified in 56/89 (63%) samples. Increased expression of OCT-4 protein in the follicular and luteal phases was found in 33/49 (67%) and 23/40 (58%) of women, respectively (p = 0.5). Conclusions: On the mRNA and protein levels, OCT-4 is not differentially expressed during the menstrual cycle. Endometrial OCT-4 is not involved in or modulated by hormone-induced cyclical changes of the endometrium.
Adaptive Communication: Languages with More Non-Native Speakers Tend to Have Fewer Word Forms.
Explaining the diversity of languages across the world is one of the central aims of typological, historical, and evolutionary linguistics. We consider the effect of language contact (the number of non-native speakers a language has) on the way languages change and evolve. By analysing hundreds of languages within and across language families, regions, and text types, we show that languages with greater levels of contact typically employ fewer word forms to encode the same information content (a property we refer to as lexical diversity). Based on three types of statistical analyses, we demonstrate that this variance can in part be explained by the impact of non-native speakers on information encoding strategies. Finally, we argue that languages are information encoding systems shaped by the varying needs of their speakers. Language evolution and change should be modeled as the co-evolution of multiple intertwined adaptive systems: On one hand, the structure of human societies and human learning capabilities, and on the other, the structure of language.
CB is funded by an Arts and Humanities Research Council (UK) doctoral grant
(reference number: 04325), a grant from the Cambridge Home and European
Scholarship Scheme, and by Cambridge English, University of Cambridge. AV is
supported by ERC grant 'The evolution of human languages' (reference number:
268744). DK is supported by EPSRC grant EP/I037512/1. FH is funded by a
Benefactor's Scholarship of St. John's College, Cambridge. PB is supported by
Cambridge English, University of Cambridge.
This is the final version. It first appeared at http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0128254
Precise Black Hole Masses From Megamaser Disks: Black Hole-Bulge Relations at Low Mass
The black hole (BH)-bulge correlations have greatly influenced the last
decade of effort to understand galaxy evolution. Current knowledge of these
correlations is limited predominantly to high BH masses (M_BH> 10^8 M_sun) that
can be measured using direct stellar, gas, and maser kinematics. These objects,
however, do not represent the demographics of more typical L< L* galaxies. This
study transcends prior limitations to probe BHs that are an order of magnitude
lower in mass, using BH mass measurements derived from the dynamics of H_2O
megamasers in circumnuclear disks. The masers trace the Keplerian rotation of
circumnuclear molecular disks starting at radii of a few tenths of a pc from
the central BH. Modeling of the rotation curves, presented by Kuo et al.
(2010), yields BH masses with exquisite precision. We present stellar velocity
dispersion measurements for a sample of nine megamaser disk galaxies based on
long-slit observations using the B&C spectrograph on the Dupont telescope and
the DIS spectrograph on the 3.5m telescope at Apache Point. We also perform
bulge-to-disk decomposition of a subset of five of these galaxies with SDSS
imaging. The maser galaxies as a group fall below the M_BH-sigma* relation
defined by elliptical galaxies. We show, now with very precise BH mass
measurements, that the low-scatter power-law relation between M_BH and sigma*
seen in elliptical galaxies is not universal. The elliptical galaxy M_BH-sigma*
relation cannot be used to derive the BH mass function at low mass or the
zeropoint for active BH masses. The processes (perhaps BH self-regulation or
minor merging) that operate at higher mass have not effectively established an
M_BH-sigma* relation in this low-mass regime.
Comment: 21 pages, 14 figures, accepted for publication in the Astrophysical
Journal
The characteristic blue spectra of accretion disks in quasars as uncovered in the infrared
Quasars are thought to be powered by supermassive black holes accreting
surrounding gas. Central to this picture is a putative accretion disk which is
believed to be the source of the majority of the radiative output. It is well
known, however, that the most extensively studied disk model -- an optically
thick disk which is heated locally by the dissipation of gravitational binding
energy -- is apparently contradicted by observations in a few major respects.
In particular, the model predicts a specific blue spectral shape asymptotically
from the visible to the near-infrared, but this is not generally seen in the
visible wavelength region where the disk spectrum is observable. A crucial
difficulty was that, toward the infrared, the disk spectrum starts to be hidden
under strong hot dust emission from much larger but hitherto unresolved scales,
and thus has essentially been impossible to observe. Here we report
observations of polarized light interior to the dust-emitting region that enable
us to uncover this near-infrared disk spectrum in several quasars. The revealed
spectra show that the near-infrared disk spectrum is indeed as blue as
predicted. This indicates that, at least for the outer near-infrared-emitting
radii, the standard picture of the locally heated disk is approximately
correct. The model problems at shorter wavelengths should then be directed
toward a better understanding of the inner parts of the revealed disk. The
newly uncovered disk emission at large radii, with more future measurements,
will also shed totally new light on the unanswered critical question of how and
where the disk ends.
Comment: published in Nature, 24 July 2008 issue. Supplementary Information
can be found at
http://www.mpifr-bonn.mpg.de/div/ir-interferometry/suppl_info.pdf Published
version can be accessed from
http://www.nature.com/nature/journal/v454/n7203/pdf/nature07114.pd
AMUSE-Virgo II. Down-sizing in black hole accretion
(Abridged) We complete the census of nuclear X-ray activity in 100 early type
Virgo galaxies observed by the Chandra X-ray Telescope as part of the
AMUSE-Virgo survey, down to a (3sigma) limiting luminosity of 3.7E+38 erg/s
over 0.5-7 keV. The stellar mass distribution of the targeted sample, which is
mostly composed of formally `inactive' galaxies, peaks below 1E+10 M_Sun, a
regime where the very existence of nuclear super-massive black holes (SMBHs) is
debated. Out of 100 objects, 32 show a nuclear X-ray source, including 6 hybrid
nuclei which also host a massive nuclear cluster as visible from archival HST
images. After carefully accounting for contamination from nuclear low-mass
X-ray binaries based on the shape and normalization of their X-ray luminosity
function, we conclude that between 24% and 34% of the galaxies in our sample host an
X-ray active SMBH (at the 95% C.L.). This sets a firm lower limit to the black
hole occupation fraction in nearby bulges within a cluster environment. At face
value, the active fraction (down to our luminosity limit) is found to increase
with host stellar mass. However, taking into account selection effects, we find
that the average Eddington-scaled X-ray luminosity scales with black hole mass
as M_BH^{-0.62^{+0.13}_{-0.12}}, with an intrinsic scatter of
0.46^{+0.08}_{-0.06} dex. This finding can be interpreted as observational
evidence for `down-sizing' of black hole accretion in local early types, that
is, low mass black holes shine relatively closer to their Eddington limit than
higher mass objects. As a consequence, the fraction of active galaxies, defined
as those above a fixed X-ray Eddington ratio, decreases with increasing black
hole mass.
Comment: Accepted for publication in ApJ (no changes wrt v1)
Mental health impact among hospital staff in the aftermath of the Nice 2016 terror attack: the ECHOS de Nice study
BACKGROUND: The Nice terror attack of July 14, 2016 resulted in 84 deaths and 434 injured, with many hospital staff exposed to the attack, either as bystanders on site at the time of the attack ('bystander exposure') who may or may not have provided care to attack victims subsequently, or as care providers to victims only ('professional exposure only'). The objective of this study is to describe the impact on mental health among hospital staff by category of exposure with a particular focus on those with 'professional exposure only', and to assess their use of psychological support resources. METHOD: An observational, cross-sectional, multicenter study was conducted from 06/20/2017 to 10/31/2017 among all staff of two healthcare institutions in Nice, using a web questionnaire. Collected data included social, demographic and professional characteristics; trauma exposure category ('bystanders to the attack'; 'professional exposure only'; 'unexposed'); indicators of psychological impact (Hospital Anxiety and Depression Scale); PTSD (PCL-5) level; support sought. Responders could enter open comments in each section of the questionnaire, which were processed by inductive analysis. RESULTS: 804 staff members' questionnaires were analysed. Among responding staff, 488 were exposed (61%): 203 were 'bystanders to the attack', 285 had 'professional exposure only'. The staff with 'professional exposure only' reported anxiety (13.2%), depression (4.6%), suicidal thoughts (5.5%); the rate of full PTSD was 9.4% and that of partial PTSD 17.7%. Multivariate analysis in the 'professional exposure only' category showed that the following characteristics were associated with full or partial PTSD: female gender (OR = 2.79; 95% CI = 1.19-6.56, p = 0.019); social isolation (OR = 3.80; 95% CI = 1.30-11.16, p = 0.015); having been confronted with an unfamiliar task (OR = 3.04; 95% CI = 1.18-7.85; p = 0.022).
Lastly, 70.6% of the staff with 'professional exposure only' with full PTSD did not seek psychological support. CONCLUSION: Despite a significant impact on mental health, few staff with 'professional exposure only' sought psychological support. Robust prevention and follow-up programs must be developed for hospital staff, in order to manage the health hazards they face when exposed to exceptional health-related events such as mass terror attacks. STUDY REGISTRATION: Ethical approval for the trial was obtained from the National Ethics Committee for Human Research (RCBID N° 2017-A00812-51).